A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence
Authors
Abstract
This paper proposes a new waveform-based speech synthesis method for high-quality Japanese text-to-speech (TTS). The method uses V-CV as the basic synthesis unit to preserve the intelligibility of consonants. An efficient unit reconstruction method is newly adopted to minimize both pitch conversion and concatenation distortion when selecting waveforms. This minimization makes the synthesized speech more fluent. Furthermore, the proposed method makes it possible to build a compact waveform dictionary while keeping the synthesized speech at high quality. Using the waveform generation function of the method, the size of the waveform dictionary can be reduced drastically, to 1/40 of its original size. In an experimental evaluation with 32 ordinary people, the proposed V-CV speech synthesis method attained a high intelligibility of 97%.
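The abstract does not spell out how the global minimization over the V-CV unit sequence is carried out. One common way to minimize a sum of per-unit and per-joint costs over a whole sequence is dynamic programming, and the sketch below illustrates that idea only. The `Unit` layout, the two cost functions, and the name `select_units` are hypothetical stand-ins, not the paper's definitions, and every position is assumed to have at least one candidate.

```python
from typing import List, Tuple

# Hypothetical candidate unit: (mean pitch in Hz, start-edge feature, end-edge feature).
Unit = Tuple[float, float, float]


def pitch_cost(unit: Unit, target_f0: float) -> float:
    """Assumed pitch-conversion cost: size of the F0 change the unit would need."""
    return abs(unit[0] - target_f0)


def concat_cost(left: Unit, right: Unit) -> float:
    """Assumed concatenation distortion: feature mismatch at the joint."""
    return abs(left[2] - right[1])


def select_units(candidates: List[List[Unit]],
                 targets: List[float]) -> Tuple[float, List[int]]:
    """Return the minimum total cost and the chosen candidate index per position."""
    # best[j]: cheapest cost of any path ending in candidate j at the current position.
    best = [pitch_cost(u, targets[0]) for u in candidates[0]]
    back: List[List[int]] = []
    for pos in range(1, len(candidates)):
        new_best, pointers = [], []
        for unit in candidates[pos]:
            costs = [best[k] + concat_cost(prev, unit)
                     for k, prev in enumerate(candidates[pos - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + pitch_cost(unit, targets[pos]))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)
    # Trace the cheapest path back from the last position to the first.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path
```

Because the search scores the whole sequence, it can prefer a unit that needs slightly more pitch conversion when that unit joins its neighbor with less distortion, which a unit-by-unit greedy choice would miss.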
Related resources
Automatic generation of speech synthesis units based on closed loop training
This paper proposes a new method for automatically generating speech synthesis units. A small set of synthesis units is selected from a large speech database by the proposed Closed-Loop Training method (CLT). Because CLT is based on the evaluation and minimization of the distortion caused by the synthesis process such as prosodic modification, the selected synthesis units are most suitable for s...
A very low bit rate speech coder based on a recognition/synthesis paradigm
Recent studies have shown that a concatenative speech synthesis system with a large database produces more natural sounding speech. We apply this paradigm to the design of improved very low bit rate speech coders (sub 1000 b/s). The proposed speech coder consists of unit selection, prosody coding, prosody modification and waveform concatenation. The encoder selects the best unit sequence from a...
Effect of vowel on pitch discrimination of CV in Japanese
The pitch dynamics seem to be based on the pitch thresholds (DLs) between the vowels, V and CV, and also CV and CV. In this study, DLs were measured for pitch (fundamental frequency) discrimination of a complex tone, Japanese vowel (V) speech, and consonant-vowel (CV) speech. The nominal F0 of the complex tone was 170 Hz, and the V and CV speech were resynthesized at the same pitch of 170 Hz. Tone was band-p...
Development of Syllable Based Unit Selection Text-To-Speech Synthesis System for Tamil Using Three Level Fall Back Technique
A text-to-speech synthesis system is one that is capable of producing intelligible and natural speech corresponding to any given text. A popular approach to speech synthesis is unit selection synthesis (USS). The current work focuses on developing a USS system for Tamil. The literature suggests that the syllable is a suitable unit for Indian languages. Creating a database that covers all the syllables ...
An automatic pitch-marking method using wavelet transform
This paper describes a new automatic pitch-marking method using wavelet transform. This method detects discontinuity in the speech waveform which occurs at the glottal closure instant (GCI). A time domain prosodic modification technique requires an appropriate determination of the synthesis pitch-marks. We evaluated the performance of the newly developed pitch-marking method by using our interna...